arXiv:math/0503434v1 [math.ST] 21 Mar 2005


A stochastic approximation algorithm with 
multiplicative step size adaptation 


Alexander Plakhov Pedro Cruz 

plakhov@mat.ua.pt jpedro@mat.ua.pt 

Department of Mathematics 
University of Aveiro — Portugal 


Abstract 

An algorithm for finding a zero of an unknown function $\varphi : \mathbb{R} \to \mathbb{R}$ is
considered: $x_t = x_{t-1} - \gamma_{t-1} y_t$, $t = 1, 2, \ldots$, where $y_t = \varphi(x_{t-1}) + \xi_t$ is the
value of $\varphi$ measured at $x_{t-1}$ with some error, and $\xi_t$ is this error. The step sizes
$\gamma_t > 0$ are random positive values calculated according to the rule:
$\gamma_t = \min\{u\gamma_{t-1}, g\}$ if $y_{t-1} y_t > 0$, and $\gamma_t = d\gamma_{t-1}$ otherwise. Here
$0 < d < 1 < u$, $g > 0$. The function $\varphi$ may have one or more zeros; the random
values $\xi_t$ are independent and identically distributed, with zero mean and
finite variance. Under some additional assumptions on $\varphi$, $\xi_t$, and $g$, the
conditions on $u$ and $d$ guaranteeing a.s. convergence of the sequence
$\{x_t\}$, as well as the conditions on $u$, $d$ guaranteeing a.s. divergence,
are determined. In particular, if $P(\xi_1 > 0) = P(\xi_1 < 0) = 1/2$ and
$P(\xi_1 = x) = 0$ for any $x \in \mathbb{R}$, it is established that convergence
takes place for $ud < 1$, and divergence for $ud > 1$. Due to the multiplicative
updating rule for $\gamma_t$, it is natural to expect that $\{x_t\}$ converges rapidly,
like a geometric progression (if convergence takes place), but the limit
value may not coincide with, but only approximate, one of the zeros of $\varphi$.

By adjusting the parameters u and d, one can reach necessary precision 
of approximation; higher precision is obtained at the expense of lower 
convergence rate. 

Key words: stochastic approximation, accelerated convergence algorithms, 
step size adaptation. 

AMS subject classification: 62L20 (Stochastic approximation), 90C15 
(Stochastic programming), 93B30 (System identification) 

1 Introduction 

Consider the problem of finding a zero of a function $\varphi : \mathbb{R} \to \mathbb{R}$. If there
are several zeros, it is required to find at least one of them. It is supposed
that the function can be measured at any point, with some random error. The
standard algorithm of stochastic approximation consists in calculating successive
approximations of the required value, $x_0, x_1, x_2, \ldots$, according to the rule

    $x_t = x_{t-1} - \gamma_{t-1} y_t, \quad t = 1, 2, \ldots,$    (1)


where 

    $y_t = \varphi(x_{t-1}) + \xi_t$    (2)

is the value of $\varphi$ measured at $x_{t-1}$, $\xi_t$ is the measurement error, and $\gamma_0, \gamma_1, \gamma_2, \ldots$
is the sequence of step sizes of the algorithm. Usually it is assumed that the step
sizes are positive real numbers satisfying the relations $\sum_t \gamma_t = \infty$, $\sum_t \gamma_t^2 < \infty$.

Then, under some additional assumptions on $\varphi$ and $\xi_t$, the algorithm a.s.
converges to a zero point of $\varphi$ (see, e.g., [...]). In practice, however, the
convergence rate of this algorithm may prove to be unsatisfactory; therefore, when
solving practical tasks, various modifications of the algorithm are used.
Heuristic algorithms are widely used in which the step size is random rather than
deterministic and is corrected in the course of the algorithm according to the
current data [...]. In particular, one idea is to decrease the step size if the
sequence of increments $x_t - x_{t-1}$ changes sign often enough, which indicates
that the current value $x_t$ is close to the set of zeros of $\varphi$, and hence that the
measurement error $\xi_t$ is large with respect to the value of the function itself,
$\varphi(x_{t-1})$. Otherwise, one should increase the step size or leave it unchanged.
Thus, Kesten in the theoretical work [7] considered an algorithm using (1), (2)
and the following rule of modification of $\gamma_t$:


    $\gamma_t = \gamma(s_t), \qquad s_t = \begin{cases} s_{t-1} & \text{if } y_{t-1} y_t > 0, \\ s_{t-1} + 1 & \text{if } y_{t-1} y_t \le 0, \end{cases}$    (3)


where $s_0 = 0$, $s_1 = 1$; $\gamma(0), \gamma(1), \gamma(2), \ldots$ is a sequence of positive numbers
satisfying the relations $\sum_r \gamma(r) = \infty$, $\sum_r \gamma^2(r) < \infty$. Thus, the step size cannot
increase in the course of the algorithm; it can only decrease or remain unchanged.
It is supposed that there is a unique zero of $\varphi$. Kesten proved that $x_t$ a.s.
converges to this zero point. A multidimensional version of this algorithm is
considered in [...].
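
Kesten's bookkeeping can be illustrated with a short sketch. The function names, the example step sequence $\gamma(s) = 1/(s+1)$, and the convention that a zero product counts as a sign change are assumptions made here for illustration only; they are not taken from [7].

```python
def kesten_indices(ys):
    """Given measurements ys = [y_1, ..., y_n] (n >= 1), return [s_0, ..., s_n]."""
    s = [0, 1]  # s_0 = 0, s_1 = 1
    for y_prev, y_cur in zip(ys, ys[1:]):
        # The index advances only when consecutive measurements change sign
        # (a zero product is treated as a sign change: an assumption).
        s.append(s[-1] + 1 if y_prev * y_cur <= 0 else s[-1])
    return s


def gamma(s):
    """Hypothetical step sequence: the sum diverges, the sum of squares converges."""
    return 1.0 / (s + 1)


indices = kesten_indices([1.0, 2.0, -1.0, -3.0, 0.5])
steps = [gamma(s) for s in indices]  # non-increasing by construction
```

Since the index $s_t$ never decreases and $\gamma(\cdot)$ is taken decreasing here, the resulting step sizes can only shrink or stay the same, in line with the remark above.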

There are also heuristic procedures (in particular, in artificial neural
networks), where at each moment $t$ the step size is multiplied by a positive constant
less than 1 if the measurement data indicate that $x_t$ is close enough to the zero
set of $\varphi$, and by a constant greater than 1 otherwise [...]. Rules of this kind
ensure a sufficiently high convergence rate; however, the step size converges
like a geometric progression, therefore $\sum_t \gamma_t < \infty$, which means that the limit of
$\{x_t\}$ need not be a zero point of $\varphi$: the sequence may "get stuck" on
its way to the set of zeros of $\varphi$. Nevertheless, such a procedure may be justified
if it gives a value close enough to one of the zeros of $\varphi$.

In the present paper, a stochastic approximation algorithm utilizing this rule
of step size modification is considered. Namely, the rule (1), (2), jointly with
the following rule

    $\gamma_t = \begin{cases} \min\{u\gamma_{t-1}, g\} & \text{if } y_{t-1} y_t > 0, \\ d\gamma_{t-1} & \text{if } y_{t-1} y_t \le 0, \end{cases} \qquad t = 2, 3, \ldots$    (4)

is used. Here $0 < d < 1 < u$, $0 < \gamma_0, \gamma_1 \le g$, and $g$ is a positive constant. Let us
point out the main differences between (4) and Kesten's rule (3). First, according
to (4), $\gamma_t$ can both decrease and increase. Second, in Kesten's algorithm
one always has $\sum_t \gamma_t = \infty$. On the other hand, it looks likely that in the case
of convergence of the algorithm (1), (2), (4), $\gamma_t$ converges like a geometric
progression (this conjecture will be justified in section 3), therefore the limit of
the algorithm may not be a zero point of $\varphi$.

Suppose that $\{\xi_t\}$ is a sequence of i.i.d. random variables with zero mean, and
moreover $P(\xi_t > 0) = P(\xi_t < 0)$. Under some additional assumptions on $\varphi$, $\xi_t$,
and $g$, stated below, the process defined by (1), (2), (4) a.s. diverges if $ud > 1$,
and converges if $ud < 1$; moreover, the limit of $\{x_t\}$ belongs to a set $\mathcal{U}(\lambda)$
whose parameter $\lambda$ is determined by $u$ and $d$. Here $\mathcal{U}(\lambda)$, $0 < \lambda < 1$,
is a monotone decreasing family of sets of real numbers; every set $\mathcal{U}(\lambda)$
contains the set $Z$ of zeros of $\varphi$, and $d(\mathcal{U}(\lambda), Z) \to 0$ as $\lambda \to 1^-$. (Here by
definition $d(A, B) = \sup_{x \in A} \inf_{y \in B} |x - y|$ for any two sets of real numbers $A$
and $B$.) This statement is a consequence of the main theorem, which will be
stated in section 2 and proved in section 3. Thus, by adjusting the parameters
$u$ and $d$ (for example, fixing $u$ and letting $d \to 1/u - 0$), one can reach the necessary
precision of the algorithm; higher precision is obtained at the expense of a lower
convergence rate.
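
The recursion for $x_t$ together with the multiplicative step size rule can be simulated in a few lines. The sketch below is only an illustration under stated assumptions: the function name `run_adaptive_sa`, the Gaussian noise with standard deviation `sigma`, and the test function $\varphi(x) = x$ are hypothetical choices, and a zero product $y_{t-1} y_t$ is treated as the "otherwise" (decrease) branch.

```python
import random


def run_adaptive_sa(phi, x0, gamma0, gamma1, u, d, g, sigma, steps, seed=0):
    """Simulate x_t = x_{t-1} - gamma_{t-1} * y_t with multiplicative step adaptation."""
    rng = random.Random(seed)
    x = x0
    xs = [x0]
    gammas = [gamma0, gamma1]  # gamma_0 and gamma_1 are given initial step sizes
    y_prev = None
    for t in range(1, steps + 1):
        # Noisy measurement of phi at the previous point (Gaussian noise is an assumption).
        y = phi(x) + rng.gauss(0.0, sigma)
        if t >= 2:
            prev = gammas[t - 1]
            # Same sign: multiply by u, capped at g; otherwise multiply by d.
            gammas.append(min(u * prev, g) if y_prev * y > 0 else d * prev)
        x = x - gammas[t - 1] * y
        xs.append(x)
        y_prev = y
    return xs, gammas


# Noise-free run on phi(x) = x, then a noisy run with ud = 0.75 < 1.
xs_det, gs_det = run_adaptive_sa(lambda x: x, 1.0, 0.5, 0.5, 1.5, 0.5, 1.0, 0.0, 6)
xs_rnd, gs_rnd = run_adaptive_sa(lambda x: x, 1.0, 0.5, 0.5, 1.5, 0.5, 1.0, 0.5, 200, seed=1)
```

In the noisy run the step sizes stay in $(0, g]$ by construction; whether and where $\{x_t\}$ settles depends on $u$, $d$ and the noise, which is the subject of the theorem below.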


2 Definition of the algorithm and statement of 
the main result 

Consider the algorithm given by (1), (2), (4). The rule (4) means that at each
instant $t$, the step size is multiplied by $u$ or by $d$ if the result of multiplication is
less than $g$; otherwise, the step size is set to $g$. Thus, the maximal possible value
of the step size equals $g$.

The rule (4) can be written in the form

    $\ln\tilde\gamma_t = \ln\gamma_{t-1} + \ln u \cdot I(y_{t-1} y_t > 0) + \ln d \cdot I(y_{t-1} y_t \le 0),$
    $\ln\gamma_t = \min\{\ln\tilde\gamma_t, \ln g\}.$    (5)
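
As a quick sanity check, the direct multiplicative update and its logarithmic form can be compared numerically. This is a minimal sketch; the function names are hypothetical, and it assumes $0 < \gamma_{t-1} \le g$, so that the cap can only matter on the increase branch.

```python
import math


def step_direct(gamma_prev, same_sign, u, d, g):
    """Multiply by u (capped at g) if the signs agree, otherwise multiply by d."""
    return min(u * gamma_prev, g) if same_sign else d * gamma_prev


def step_log(gamma_prev, same_sign, u, d, g):
    """The same update carried out on logarithms, followed by the cap at ln g."""
    log_tilde = math.log(gamma_prev) + (math.log(u) if same_sign else math.log(d))
    return math.exp(min(log_tilde, math.log(g)))
```

The logarithmic form is convenient because the update becomes additive, so the behaviour of $\ln\gamma_t$ can be studied as a random walk with a barrier at $\ln g$.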

Let us take the following assumptions: 

A1 Denote by $\mathcal{F}_t$, $t = 0, 1, 2, \ldots$, the $\sigma$-algebra generated by $x_i$, $\gamma_i$, and $\xi_i$, $0 \le i \le t$;
then $\xi_{t+1}$ does not depend on $\mathcal{F}_t$.

A2 The values $\xi_t$ are identically distributed, with zero mean and finite variance:
$\mathrm{E}\xi_t = 0$, $\mathrm{Var}\,\xi_t =: S < +\infty$.

A3 (a) There exists $L > 0$ such that for any interval $I \subset [-L, L]$, $P(\xi_1 \in I) > 0$;

(b) $P(\xi_1 = 0) = 0$.

A4 $\varphi \in C^1(\mathbb{R})$ and $\sup_x |\varphi'(x)| =: M < \infty$.

A5 $g < 2/M$.
A6 There exists $R > 0$ such that

(a) $x\varphi(x) > 0$ as $|x| \ge R$, and

(b) $\inf_{|x| \ge R} \varphi^2(x) > \dfrac{gMS}{2 - gM}$.

Remark 1 From A4 and A6(a) it follows that the set $Z$ is non-empty and is
contained in $(-R, R)$.


Remark 2 Note that assumptions A4-A6 guarantee convergence of the
deterministic counterpart of algorithm (1), (2), (4) (that is, of the algorithm
with $\xi_t = 0$). Moreover, under these conditions, any deterministic algorithm
$x_t = x_{t-1} - \gamma_{t-1}\varphi(x_{t-1})$ converges, whatever the sequence $\{\gamma_t\}$ satisfying $\gamma_t < g$.


Introduce the functions: 


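    $k_+(z) := \lim_{\varepsilon \to 0^+} \sup\{P((\varphi_1 + \xi_1)(\varphi_2 + \xi_2) > 0) : |\varphi_1 - z| < \varepsilon,\ |\varphi_2 - z| < \varepsilon\},$    (6)

    $k_-(z) := \lim_{\varepsilon \to 0^+} \inf\{P((\varphi_1 + \xi_1)(\varphi_2 + \xi_2) > 0) : |\varphi_1 - z| < \varepsilon,\ |\varphi_2 - z| < \varepsilon\};$    (7)

one has $k_+(z) \ge 1/2$, $0 \le k_\pm(z) \le 1$, $\lim_{z \to \pm\infty} k_\pm(z) = 1$.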

Further, define the sets of real numbers 

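    $\mathcal{E}^+_a := \{x : k_+(\varphi(x)) < a\}, \qquad \mathcal{E}^-_a := \{x : k_-(\varphi(x)) \le a\};$    (8)

obviously, $\mathcal{E}^+_a \subset \mathcal{E}^-_a$ for any $a$.

Note that $\mathcal{E}^+_a$ is open. Indeed, let $x \in \mathcal{E}^+_a$; then there exists $\varepsilon > 0$ such
that

    $\sup\{P((\varphi_1 + \xi_1)(\varphi_2 + \xi_2) > 0) : |\varphi_1 - \varphi(x)| < \varepsilon,\ |\varphi_2 - \varphi(x)| < \varepsilon\} =: c < a.$

Then for $x'$ close enough to $x$ one has $|\varphi(x') - \varphi(x)| < \varepsilon/2$, hence

    $\sup\{P((\varphi_1 + \xi_1)(\varphi_2 + \xi_2) > 0) : |\varphi_1 - \varphi(x')| < \varepsilon/2,\ |\varphi_2 - \varphi(x')| < \varepsilon/2\} \le c < a.$

This implies that $k_+(\varphi(x')) < a$, hence $x' \in \mathcal{E}^+_a$.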


Denote also 


    $\kappa := \dfrac{\ln(1/d)}{\ln(u/d)}.$    (9)
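
The ratio $\ln(1/d)/\ln(u/d)$ defined in (9) is easy to tabulate. The following minimal sketch (the function name `kappa` is a hypothetical choice) also illustrates an elementary fact: since $\ln(u/d) = \ln u + \ln(1/d)$, the ratio exceeds $1/2$ exactly when $\ln(1/d) > \ln u$, that is, when $ud < 1$.

```python
import math


def kappa(u, d):
    """Ratio ln(1/d) / ln(u/d) for 0 < d < 1 < u."""
    return math.log(1.0 / d) / math.log(u / d)


k_example = kappa(2.0, 0.25)    # ln 4 / ln 8 = 2/3, here ud = 0.5 < 1
k_boundary = kappa(2.0, 0.5)    # ud = 1 gives exactly 1/2
k_divergent = kappa(3.0, 0.5)   # ud = 1.5 > 1 gives a value below 1/2
```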

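Denote by $Z$ the set of zeros of $\varphi$, i.e., $Z := \{x : \varphi(x) = 0\}$. Suppose that
$x \in \mathcal{E}^+_\kappa$, $x_{t-2} \in (x - \varepsilon, x + \varepsilon) \subset \mathcal{E}^+_\kappa$, and $\gamma_{t-2} < \varepsilon$, where $\varepsilon$ is a small
positive number. Then, with a probability close to 1, $x_{t-1}$ also belongs to a
small (possibly larger) neighborhood of $x$ contained in $\mathcal{E}^+_\kappa$, and taking into
account (2) and (6), one gets

    $P(y_{t-1} y_t > 0 \mid |x_{t-2} - x| < \varepsilon,\ \gamma_{t-2} < \varepsilon) =$
    $= P((\varphi(x_{t-2}) + \xi_{t-1})(\varphi(x_{t-1}) + \xi_t) > 0 \mid |x_{t-2} - x| < \varepsilon,\ \gamma_{t-2} < \varepsilon) < \kappa.$

Then, using (5) and (9), one obtains

    $\mathrm{E}[\ln\gamma_t - \ln\gamma_{t-1} \mid |x_{t-2} - x| < \varepsilon,\ \gamma_{t-2} < \varepsilon] \le$
    $\le \ln u \cdot P(y_{t-1} y_t > 0 \mid |x_{t-2} - x| < \varepsilon,\ \gamma_{t-2} < \varepsilon) + \ln d \cdot P(y_{t-1} y_t \le 0 \mid |x_{t-2} - x| < \varepsilon,\ \gamma_{t-2} < \varepsilon)$
    $< \ln u \cdot \kappa + \ln d \cdot (1 - \kappa) = 0.$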


Thus, in a sense, the set $\mathcal{E}^+_\kappa$ can be regarded as a domain of decrease of
step size: if several consecutive values of $x_t$ belong to $\mathcal{E}^+_\kappa$ and are close enough
to each other, and if the first term of the sequence of corresponding step sizes
$\gamma_t$ is small enough, then the sequence of mean values $\mathrm{E}\ln\gamma_t$ decreases.
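
The sign of the mean logarithmic increment can be checked directly. In the sketch below, `p` stands for the probability that two consecutive measurements have the same sign; the function names are hypothetical, and the cap at $g$ is ignored (that is, the step size is assumed to be well below $g$).

```python
import math


def kappa(u, d):
    """Ratio ln(1/d) / ln(u/d)."""
    return math.log(1.0 / d) / math.log(u / d)


def mean_log_increment(p, u, d):
    """Mean of ln(gamma_t) - ln(gamma_{t-1}) when the signs agree with probability p."""
    return p * math.log(u) + (1.0 - p) * math.log(d)


u, d = 2.0, 0.25
drift_below = mean_log_increment(0.5, u, d)           # p below the threshold: negative
drift_at = mean_log_increment(kappa(u, d), u, d)      # p equal to the threshold: zero
drift_above = mean_log_increment(0.8, u, d)           # p above the threshold: positive
```

The increment is linear in `p` and vanishes exactly at the threshold $\ln(1/d)/\ln(u/d)$, which is why that ratio separates the regions of decrease and increase of the step size.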

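Now, suppose that $x \in \mathbb{R} \setminus \mathcal{E}^-_\kappa$, $x_{t-2} \in (x - \varepsilon, x + \varepsilon) \subset \mathbb{R} \setminus \mathcal{E}^-_\kappa$, and that
$\gamma_{t-2} < \varepsilon$. Analogously, for $\varepsilon$ small enough, one has

    $P(y_{t-1} y_t > 0 \mid |x_{t-2} - x| < \varepsilon,\ \gamma_{t-2} < \varepsilon) > \kappa,$

and then, using again (5) and (9) and taking into account that for $\varepsilon < g/u^2$,
$\tilde\gamma_t = \gamma_t$, one obtains

    $\mathrm{E}[\ln\gamma_t - \ln\gamma_{t-1} \mid |x_{t-2} - x| < \varepsilon,\ \gamma_{t-2} < \varepsilon] =$
    $= \ln u \cdot P(y_{t-1} y_t > 0 \mid |x_{t-2} - x| < \varepsilon,\ \gamma_{t-2} < \varepsilon) + \ln d \cdot P(y_{t-1} y_t \le 0 \mid |x_{t-2} - x| < \varepsilon,\ \gamma_{t-2} < \varepsilon)$
    $> \ln u \cdot \kappa + \ln d \cdot (1 - \kappa) = 0.$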


Thus, the set $\mathbb{R} \setminus \mathcal{E}^-_\kappa$ can be regarded as a domain of increase of step size: if
several consecutive values of $x_t$ belong to $\mathbb{R} \setminus \mathcal{E}^-_\kappa$ and are close enough to each
other, and if the first of the corresponding values of $\gamma_t$ is small enough, then the
sequence of mean values $\mathrm{E}\ln\gamma_t$ increases.

Note that if $\kappa > k_+(0)$ then, by virtue of (8), $Z \subset \mathcal{E}^+_\kappa$, that is, all the
zeros of $\varphi$ belong to the region of decrease of step size. On the other hand, if
$\kappa < \inf_z k_-(z)$ then $\mathcal{E}^-_\kappa = \emptyset$, which means that the region of increase of step
size coincides with $\mathbb{R}$.

It seems likely that in the first case the algorithm can converge, and in the 
second one, cannot. This conjecture is confirmed by the following theorem, 
which is the main result of the paper. 

Theorem Let the assumptions A1-A6 be satisfied; consider the process
$\{x_t, \gamma_t\}$ defined by (1), (2), (4). Recall that $\kappa = \ln(1/d)/\ln(u/d)$. Then

(a) If $\kappa > k_+(0)$ then $\{x_t\}$ a.s. converges to a point from $\mathcal{E}^-_\kappa$.

(b) If $\kappa < \inf_z k_-(z)$ then $\{x_t\}$ a.s. diverges.

Suppose that $P(\xi_1 = x) = 0$ for any real $x$ and that $P(\xi_1 > 0) = P(\xi_1 < 0)$.
Then the function $k(\cdot) := k_+(\cdot)$ coincides with $k_-(\cdot)$, is continuous, and is given
by

    $k(z) = P((z + \xi_1)(z + \xi_2) > 0);$


$z = 0$ is the unique minimum of $k(\cdot)$, and $k(0) = \inf_z k(z) = 1/2$. After simple
algebra, one can rewrite the hypotheses of the theorem in the form (a) $ud < 1$, (b)
$ud > 1$. Denote $\mathcal{U}(\lambda) := \mathcal{E}^-_{1/(1+\lambda)} = \{x : k(\varphi(x)) \le \frac{1}{1+\lambda}\}$; $\mathcal{U}(\lambda)$, $0 < \lambda < 1$, is a
monotone decreasing family of sets containing $Z$ and tending to $Z$ as $\lambda \to 1^-$.
Thus, one comes to

Corollary Let, in addition to assumptions A1-A6, $P(\xi_1 = x) = 0$ for any
$x \in \mathbb{R}$, and $P(\xi_1 > 0) = P(\xi_1 < 0) = 1/2$. Consider the process defined by
(1), (2), (4). Then there exists a monotone decreasing family of sets $\mathcal{U}(\lambda)$,
$0 < \lambda < 1$, such that $\mathcal{U}(\lambda) \supset Z$, $d(\mathcal{U}(\lambda), Z) \to 0$ as $\lambda \to 1^-$, and

(a) if $ud < 1$ then $\{x_t\}$ a.s. converges to a point from $\mathcal{U}(\lambda)$ with $\lambda = \ln u/\ln(1/d)$;

(b) if $ud > 1$ then $\{x_t\}$ a.s. diverges.
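
For a concrete noise distribution, the function $k(z) = P((z + \xi_1)(z + \xi_2) > 0)$ has a closed form. The sketch below assumes, purely for illustration, that $\xi_1$ and $\xi_2$ are independent Gaussian variables with zero mean and standard deviation `sigma`; then with $p = P(z + \xi_1 > 0)$ one has $k(z) = p^2 + (1 - p)^2$. The function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def k_gauss(z, sigma=1.0):
    """P((z + xi_1)(z + xi_2) > 0) for independent N(0, sigma^2) noise (an assumption)."""
    p = normal_cdf(z / sigma)  # probability that z + xi is positive
    return p * p + (1.0 - p) * (1.0 - p)


k_at_zero = k_gauss(0.0)   # both signs equally likely, so the value is 1/2
k_far = k_gauss(10.0)      # far from zero the signs almost surely agree
```

Under this assumption $k$ is symmetric, equals $1/2$ at $z = 0$ and increases towards 1 as $|z|$ grows, so a threshold slightly above $1/2$ (that is, $ud$ slightly below 1) singles out points where $|\varphi(x)|$ is small relative to `sigma`.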


Remark 3 The theorem does not give any information about the behavior of the
algorithm for the values $u$, $d$ such that

    $\inf_z k_-(z) \le \dfrac{\ln(1/d)}{\ln(u/d)} \le k_+(0).$

In particular, under the hypotheses of the corollary, the case $ud = 1$ remains
unexplored. These issues will be addressed elsewhere.


3 Proof of theorem 

First we prove 10 auxiliary lemmas, and then, based on them, we prove the theorem.

Here all statements about random variables are supposed to be true almost
surely.

In the sequel, we shall mainly designate random values by Greek letters, and
real numbers and functions from $\mathbb{R}$ to $\mathbb{R}$ by Latin ones; the letters $t$, $i$, $j$, $s$ will
denote non-negative integers. The function $\varphi$ and the random values
$x_t$, $y_t$ are exceptions; also, the traditional notation $\varepsilon$, $\delta$ for small positive numbers
will be used.

Lemma 1 If $\sum_t \gamma_t < \infty$ then the sequence $\{x_t\}$ converges.

Proof. Note that without loss of generality one can assume that $x_0$ is
bounded. Indeed, replacing $x_0$ by $\tilde x_0 = x_0 \cdot I(|x_0| < X)$ changes the
process only with probability $P(|x_0| \ge X)$. By taking $X$ large enough, one can
make this probability arbitrarily small.

Let $C > 0$; define the stopping time $\tau_C = \inf\{t : \sum_{i=0}^{t} \gamma_i > C\}$, and introduce
the new process $x_t^C$, $\gamma_t^C$ by

    $x_t^C = x_t$, $\gamma_t^C = \gamma_t$ as $t < \tau_C$, and
    $x_t^C = x_{\tau_C}$, $\gamma_t^C = 0$ as $t \ge \tau_C$.

First, let us prove that the sequence {x^} is bounded. Designate M/j := 
sup|3;|>tj from A 4 it follows that Mr < oo. One has 

    $|x_t^C| \le |x_{t-1}^C - \gamma_{t-1}^C \varphi(x_{t-1}^C)| + \gamma_{t-1}^C |\xi_t|.$    (10)

Using that $\gamma_{t-1}^C \le C$ and $|\varphi(x_{t-1}^C)| \le |\varphi(0)| + M|x_{t-1}^C|$, one obtains

    $|x_t^C| \le |x_{t-1}^C|(1 + CM) + \gamma_{t-1}^C(|\varphi(0)| + |\xi_t|).$    (11)

If < 2 /Mfi, an even more precise estimate for xf can be obtained. We 
shall distinguish between two cases: (i) |xt-i| < R and (ii) \x^_i\ > R. 

In case (i), designating b := sup|^|<^ lv^(2:)|, one has 

\x^_i - 7t-iV3(a;f-r)l < \xt-i\ + (12) 


In the case (ii) one has 


^ i^iXf 1 ) 2 

0 < = 2 , 

xr_i Mr 


hence 

1 x^1 - < \x^_i\. ( 13 ) 

Thus, in both cases (i) and (ii), from (Cnil, 112), and m one gets 

|a:f| < Ix^il+7t-i(fe+161). (14) 

The overall number of values of t such that < 2 /Mr is less than CMr/ 2 -, 
therefore, using m and one concludes that 

|xf|< |^|a:o|+^ 7 f-i(&+lv^( 0 )| + |6l)^ •(l + C^M)^"“/6 ( 15 ) 

Denote Cq := &+ 1 :/ 5 ( 0 )|+E|6| and Ct := |6 |-E|6 I; using that J 2 T ^ ^ 
one gets 

\x^\ < |^|xo|+Cco+^7 ,-i6 ^ •(l + CM)^^«/2. ( 16 ) 

Using that E( 7 t^iCt)^ = E^i • Sr'E(7tl-i)^ < oo, one obtains that the 
martingale if-iQ is bounded; the value xg is also bounded, so, by m , one 
concludes that the sequence {xf} is bounded. 

Now, let us show that {x/^} converges. From the definition of xf and jf it 
follows that 

xf = xo -'^-/f_Mxf-i) - X^T-rC*. 

1 1 

Using that the sequence {(fi{xf_i)} is bounded and that 'yf_^ < C, one gets 
that the series "lf-i'^{xf_fj converges. Further, one has 




5 .^F(7f_i)2 <oo, 
1 
hence the martingale $\sum_{i=1}^{t} \gamma_{i-1}^C \xi_i$ converges. This implies that $\{x_t^C\}$ also converges.

Define the events Ac = {^tlt < C} and A^o = < o®}- One has 

Aoo = Ac Ac- If — O' then xp = Xt for any t; this means that I(Ac) • 

(a;p — Xt) = 0 for any t and C. The sequence { I{Ac)xf} converges, therefore 
the sequence { I{Ac)xt} also converges, and passing to the limit C ^ oo one 
obtains that { I(^oo)a;t} converges. This means exactly that if 74 < oo then 
{xt} converges. □ 

Lemma 2 If $\lim_{t \to \infty} x_t = x$ then $x \in \mathcal{E}^-_\kappa$.


Proof. Note that, using A 3 (a), it is easy to show that there exists ( 5 o > 0 such 
that P(^i ^ [x — L/2, X + L/2]) > 60, whatever a; S K. 

Next, for any x ^ there exist w{x) > 0 and 0 < e(a;) < L /4 such 

that the following holds: for any two random variables (pi and p2 satisfying the 
relations \pi — ‘.p{x)\ < e{x), I = 1,2 one has 


P((^^i +6)(02 +6) > 0) > 


ln(l/c?) -I- w{x) 
Inu -I- ln(l/(i) 


Choose a countable set of intervals [/i = ((p(xi)—e(xi), (p(xi)+e(xi)) covering 
the set \ and denote Wi := w(xi). Fix i and s G { 0 , 1 , 2 ,...}, and 

define the auxiliary process by formulas: 


if t < s then 

x|**^ = Xt, and 

if t > s then 


xf ) = i 

' (is) (is) (is 

xl_{ - yl 

> if ^(xiLl-7l!.l2/f))GC/i, 

(17) 

‘ 1 

Xt 

elsewhere; 

(is) 

Vt = 

:b’(xi!!l) -b6. 


( 18 ) 

II 

/ niin{u7j^!f], g} 

1 

if 

if yi-U-^<0. 

( 19 ) 

So, as t> s, 

is forced to be contained in Ui. 



For t > s -b 2 , using that = (pix["fl) + ^t-i, + 6. 

G Ui, one obtains that 


and 


p(y(!!lyr)>0)> 


\n{l/d) + Wi 
Inrt -b ln(l/fi) 




In M — Wi 
Inw -b ln(l/fi) ’ 


E[lnM- > 0) -b Ind- j/f < 0)] > 


hence 



ln(l/(i) + Wi \nu-w^ 

Consider variables (^i = /i(^i,^2) and (j)2 = /2(Ci,C2) providing a solution 
of the (deterministic) minimization problem: 

(<(>i + Ci)(<A 2 + 6) ^ min, 

subject to 

1^1 - < e{xi) 

102 - '^{xi)\ < e{xi), 

and denote = /i(6-i,6) + Ct-i, = /2(6-i,6) + 6, = Inu • 

> 0 ) + Ind • < 0 ). One has 

(i) ?7t < Inu • > 0) + Ind • < 0); 

(ii) rjt are identically distributed, and Er^t > wi] 

(iii) the set of random variables {ryt, t even, t > s + 2} as well as the set 
{r/t, t odd, t>s + 2}, are mutually independent. 

From (ii)-(iii) it follows that almost surely ry = +oo, and from (i) it 
follows that 


^[Inu • y[""^ > 0) + Ind • < 0)] = +oo, 

t 

SO, by virtue of m, does not go to zero. 

Thus, there exists a random value x > 0 such that for infinitely many values 
of t, yf> X. 

Define a sequence of stopping times tq, ti, T2, ... inductively, letting tq = 0 
and Tj = inf{t > rj_i : y^**^ > x} for j > 1 . The events Bj = {I^t^+i + 
ip{xi)\ > T/2} happen with probability more that 5 q (recall the remark done 
in the beginning of proof), and every event Bj^ J > 2 does not depend on the 
set of events {Bi,... ,Bj-i}. Therefore, for infinitely many values of j, Bj, 
takes place, i.e., |Ctj+i + ‘p{xi)\ > T/2, and hence, taking into account that 
l2/r^+i| > ICr^+i +<p{xt)\ - \(p{xr,j) “ '^[xi)\ and \ip[Xr^)- ^{xi)\ < e{xi) < L/ 4 , 
for these values of j one has |?/rj+i| > T/ 4 . Thus, one concludes that 

for infinitely many values of j, \^rjyTj+i\ > xf^/ 4 - (20) 

Suppose that Xt converges to a point from R \ , then for some i and s one 

has Xt G Ui as t > s, hence the process x[^^\ y^**^ coincides with xt, yt, and 
therefore yt yt+i —*• 0 as t —*■ oo. The last relation contradicts P1| . thus Lemma 
2 is proved. □ 

Lemma 3 Let $\sum_t \gamma_t = \infty$. Then for any open set $O$ containing $Z$ there exists
a positive constant $\underline{g} = \underline{g}(O)$ such that either (i) for some $t$, $x_t \in O$, or (ii) for
some $t$, $|x_t| \le R$ and $\gamma_t > \underline{g}$.

Proof. Designate by $f$ the primitive of $\varphi$ such that $\inf_x f(x) = 0$. Define the
stopping time

    $\tau = \tau(O, \underline{g}) := \inf\{t : \text{either (i) } x_t \in O, \text{ or (ii) } |x_t| \le R \text{ and } \gamma_t > \underline{g}\}.$

The value of $\underline{g} \in (0, g)$ will be specified below.

Consider the sequence Et = Fi[f{xt) I(t < r)]. Introducing shorthand nota¬ 
tion f{xt) =■■ ft, I(t < r) =: It, f'ixt) =■■ fl = pt, and using that R < R-i, 
one gets 


Et — Et-i — E[/t It — ft-i It-i] < E[(/t — ft-i) It-i]- ( 21 ) 

Next, we utilize the Taylor decomposition 

ft = fixt-i - "ft-iVt) = ft-i - f't-i It-iVt + ^ f'\x') -ft-iVt, 

x' being some point between Xt-i and Xt- Substituting yt = pt-i + £,t and 
recalling that //_! = (fit-i and f''{x') = p^{x^) < M, one obtains 

ft - ft-i < -7t-i Pt-i{pt-i +it) + Y T't-i (^‘-1 + (22) 

Using lEO and and taking into account that each of the values 7t-i, Pt-i, 

It_i is mutually independent with ^t (see Al), one gets 

Et - Et-i < E[(-7t_i ipt-i£.t + ^It-i Pt-i + pt-iit + ^It-i &) It-i] 

= E[(-v3Li + Pt-i + ^It-iSht-i If-i] = 

= E[(-^2_i(l - M^t-il2) + M7t_i5/2)7t_i It_i]. 

( 23 ) 

If If_i = 1 then either (i) Xf_i G [—i?, i?]\C> and7t_i < g, or (ii) |a;t-i| > 

R. 

In the case (i) one has 

- M-it-xl 2 ) + M-ft-iSl 2 < -co(l - Mg/ 2 ) + MgS /2 =: -c^, ( 24 ) 

where cq := inf{|(t5(a;)| : x G [—i?, i?] \ O}; obviously, cq > 0 . Let us fix a 
g G ( 0 , g) such that c'g > 0 . 

In the case (ii), designating bo := Yni\x\>RP^{x), one has 
-(p?_i(l - M-ft-i/ 2 ) + M^t-iS /2 < -boil - Mg/ 2 ) + MgS /2 =; -c". ( 25 ) 

Using A6, one gets that c" > 0 . 

Denote c = minjc^, c"}. The relations and (ESI) imply that if It_i = 1 
then — Mjt-i/ 2 ) + Mjt-iS/2 < —c < 0 , hence, by virtue of (ESI) . 


Et — Et-i < —c ■ E[7t_i Ij-i]. 

Summing up both sides of (ESI) over t — 1 ,..., s and denoting Iq 
00) = mint It, one obtains 


Eg — Eg < —c • E 




, 2=0 


( 26 ) 
I(r = 
One has $E_s \ge 0$, and $x_0$ is bounded, hence $E_0 < \infty$. Thus, for arbitrary $s$


E 


s-l 




Li=0 


Eo 

< — < oo. 
c 


This implies that a.s. either 7 ^ < 00 , or r = 00 . Lemma 3 is proved. □ 

Denote $c_1 := 1 - Mg/2$. Recall that $f$ is the primitive of $\varphi$ such that
$\inf_x f(x) = 0$; the assumption A6 implies that $\lim_{x \to \pm\infty} f(x) = +\infty$. Denote
$H := \sup_{|x| \le R} f(x)$. Denote also $c_3 := g \cdot \sup\{|\varphi(x)| : f(x) \le H\} + 1$,
$z^l := \inf\{x : f(x) \le H\} - c_3$, $z^r := \sup\{x : f(x) \le H\} + c_3$,
$c_2 := \inf\{|\varphi(x)| : x \in [z^l, z^r] \setminus O\}$, and $K := \sup\{|\varphi(x)| : x \in [z^l, z^r]\}$.
Obviously, $c_1 > 0$ and $K \ge c_2 > 0$.

Fix an open set $O$ containing $Z$. Let $\underline{g} > 0$, $0 < w < 1$. We shall say that a
(finite or infinite) deterministic sequence $\{z_0, z_1, z_2, \ldots\}$ is $(\underline{g}, w)$-admissible if
$|z_0| \le R$ and there exist deterministic sequences $\{q_t\}$, $\{h_t\}$ such that

1) $|h_t| \le w$;

2) if $\{z_0, z_1, \ldots, z_t\} \subset [z^l, z^r] \setminus O$ then $\underline{g} d^2 \le q_s \le g$, $s = 0, 1, \ldots, t$;

3) $z_t = z_{t-1} - q_{t-1}\varphi(z_{t-1}) - h_t$, $t = 1, 2, \ldots$.


Proposition 1 There exist constants $t_0$ and $w$ such that any $(\underline{g}, w)$-admissible
sequence $\{z_t,\ t = 0, 1, \ldots, t_0\}$ has non-empty intersection with $O$.

Proof. Let w := min{l, gfi^c|ci/(2K)}. Designate i = infjt : zt G O}; i takes 
values from {0, 1,.. . ,toj + 00 }. We shall use shorthand notation ft := f{zt), 
f[ = (ft ■= p{zt). One has 

ft = f{zt-i - qt-iPt-i - ht) = f{zt-i - qt-ipt-i) - f'{z).ht, (27) 


where z is a point between zt-i — qt-ipt-i and Zt_i — qt-ipt-i — ht- 
Next, one has 

f{zt-i - qt-ipt-i) = ft-i - ft_^qt-iPt-i + i/"(5) (28) 

where z is a point between zt-i and zt-i — qt-ipt-i- 
We are going to prove by induction that 

if 0 < s < f then fs<H — s- gd^C2Ci/2. (29) 

For s = 0, 123 follows from the condition |zo| < R and the definition of H. 
Now, let 1 < t < t; suppose that formula 123 is true for 0 < s < t — 1 and 
prove it for s = t. For 0 < s < t — 1, one has /(Zg) < H, Zg ^ O, therefore 
Zg G [z\ z'"] \ O; hence, by virtue of 2), gd"^ < 9 s < g for 0 < s < t — 1. One 
has f{zt-i) < H, \qt-i(pt-i\ < g • sup{|</j(a:)| : f{x) < H}, and \ht\ < w < 1, 
hence \qt-i(pt-i\ < C 3 , \qt-iTt-i + ht\ < C 3 , and so, Zt_i - qt-iPt-i G [z\ z^], 
Zt-\ — qt-iPt-i — ht & [z\ z^], thus z also belongs to [z*, z’’]. This implies that 
$|\varphi(z)| = |f'(z)| \le K$. Then, combining (27) and (28) and using that $|h_t| \le w$
and $|f''(z)| = |\varphi'(z)| \le M$, one obtains

ft < ft-i - + wK. (30) 

One has zt-i G [zf z’’] \ O, hence \ip{zt-i)\ = > C 2 . Using also that 

qt-i > gdf, 1 — \qt-iM > ci, and wK < one gets from (Tinil that 

ft < ft-i — gd^clci/2, 

and using the induction hypothesis, one concludes that 

ft < H -t ■ gcPclci/2. 


Formula is proved. 

Let to := [ 2 H/{g(Pc 2 Ci)\ + 1; here [zj stands for the integral part of z. 
Then, taking into account that fs > 0, from (PI one concludes that t < to, 
thus Proposition 1 is proved. □. 

Proposition 2 If $\gamma_{t-1} < 1/(3M)$, $|\xi_t| < c_2$, $|\xi_{t+1}| < c_2$, and $x_{t-1}$ and $x_t$ belong
to $[z^l, z^r] \setminus O$, then $\gamma_{t+1} \ge \gamma_t$.

Proof. Using notation ipt := ^(xt), one gets 

(fit = p{xt-i - jt-iipt-i + £.t)) = Pt-i - p'{x) ■ 7t_i((p4_i + it), 

where a; is a point between a;t_i and Xt- Therefore, 


ipt-ipt = Pt-i ■ [1 - p'{x)-ft-i ■ (1 + it/pt-i)]- 


Using that | 95 '(i)| < M, 7t_i < 1 /( 3 M), |^t| < C2, |<4at_i| > C2, one obtains 
1 — (p'{x)"ft-i ■ (1 + it/pt-i) > 1 / 3 , hence (pt-iipt > 0 . Further, using that 
l^tl < C2, |^4+i| < C2, \pt-i\ > C2, \pt\ > C2, one gets 

Vt Vt+i = Pt-iPt ■ (1 + it/‘Pt-i){^ + it+i/pt) > 0 . 

This implies that 74+1 = min{u74,g} > 74. □ 

Lemma 4 For any open set $O$ containing $Z$ and any $\underline{g} > 0$ there exists
$\delta = \delta(O, \underline{g}) > 0$ such that

    if $|x_0| \le R$, $\gamma_0 \ge \underline{g}$ then $P(\text{for some } t,\ x_t \in O) \ge \delta$.

Proof. Without loss of generality suppose that $\underline{g} \le 1/(3M)$. Define the event

    $A := \{|\xi_i| < \min\{c_2, w/g\},\ i = 1, 2, \ldots, t_0\},$

where $w$ and $t_0$ are the same as in the proof of Proposition 1: $w = \min\{1, \underline{g} d^2 c_2^2 c_1/(2K)\}$,
$t_0 = \lfloor 2H/(\underline{g} d^2 c_2^2 c_1) \rfloor + 1$.
Denote

    $\delta := P(A) = (P(|\xi_1| < \min\{c_2, w/g\}))^{t_0};$

by virtue of A3(a), $\delta > 0$. Let us show that for any elementary event $\omega \in A$,
the sequence $\{z_t = x_t(\omega),\ t = 0, 1, \ldots, t_0\}$ is $(\underline{g}, w)$-admissible.

One has j^ol = |3;o(w)| < R. Further, one has Zt = Zt-i — qt-i^{zt-i) — ht, 
with qt-i = 7t_i(w), ht = 7t_i(w) and using that 7t_i(w) < g and 

|5j(w)| < w/g, one gets \ht\ < w. Thus, conditions 1) and 3) are verihed. 

Now, let {zo, zi, ..., Zt} C [z*, z’’] \ e>, t < to- Let sq G { 0 , 1 , 2, ..., t} 
be the minimal value such that qg^ = minjgo, 9i) • ■ • > 9t}- If So = 0 then 
minj^o, qi, - ■ ■ ,qt} = qo = 7o(w) >9> g(P- If So = 1 then min{go, gi, - - -, gj = 
gi = 7i(‘^) P gd > gd^. If So > 2 then > 1/(3M); otherwise, using 

that |Cso-il < C2, |6ol < C2, Xso_2(w) and Xso-i(uj) belong to [z\ z^] \ O, 
and applying Proposition 2, one would conclude that 7so(<^) ^ 7so-i(‘^)) which 
contradicts the definition of so- 

Thus, 7so(‘^) > 1/(3M) • (P > gcP, and therefore, minjgo, gi,..., gt} = 
7so(w) > gd^- So, the condition 2) is also verihed. 

Now, applying Proposition 1 to the (g, w)-admissible sequence {z*}, one 
concludes that there exists a non-negative t < to such that Zr = Xr(co) € O. 
This implies that 

P(for some t, Xt € O) > P(A) = S. 


□ 


Lemma 5 If $\sum_t \gamma_t = \infty$ then for any open set $O$ containing $Z$ there exists $t$
such that $x_t \in O$.


Proof. Let us hx an open set O D Z, and denote 6 = 6{0,g{0)). Combining 
Lemma 3 and Lemma 4, one concludes that for any O Z) Z there exists (5 > 0 
such that whatever the initial conditions xo, 70, 7 i, 

P(for some t, Xt S O | ^ 7* = 00) > S. 

t 

Then one can choose a measurable integer-valued function dehned on 

M X (0,g] X (0,g] such that for i/ = n(xo,jo, 7 i) one will have 

P(for some t < n, Xt G O | ^ jt = 00) >5/2 

t 

Designate 

p = supP(for all t, Xt ^ o\^ 

i 

the supremum being taken over all the initial conditions xo, 70, 7 i- Fix xq, 70, 
71, then 

P(for all t, xt ^ O J2t 7i = 00) = 

= P(for all t > iz, xt ^ O for all t <v, xt ^ O and 7* = oo)- 
•P(for alH < I/, xt 0\ J2t 7 t = 00 ) <p{l — 5/2). 
Taking the supremum of the left hand side of the last relation over all $(x_0, \gamma_0, \gamma_1) \in \mathbb{R} \times
(0, g] \times (0, g]$, one obtains $p \le p(1 - \delta/2)$, hence $p = 0$. Lemma 5 is proved. □

Denote $O^* = \{x : |\varphi(x)| < L/2\}$.

Lemma 6 For any open bounded sets $O$, $O_1$ such that $O \subset O_1 \subset O^*$ and for
any $w > 0$ there exists $\delta = \delta(O, O_1, w) > 0$ such that

    if $x_0 \in O$ then $P(\text{for some } n,\ x_n \in O_1 \text{ and } \gamma_n < w) \ge \delta$.
Proof. Denote n = Ltj^TT^J + 2. Denote also 


e = min 


L d{0,M.\0i) 
2 ’ 


where d{A, B) := sup^g^ infygs |a: — ?/| for arbitrary sets of real numbers A, B. 
Using assumption A3 (a), one obtains that there exists (5i > 0 such that for any 
X G Oi and for any integer t, 


P ((-!)* V(a^) < (-1)‘6 < (-1)* V(a;) + ff) > 5i. 


This implies that if xq G O then 

P(0 < (-l)Vt < £) dist(a:t_i, O) < (t - l)ge, t = 1, 2, ..., n + 1) > 

Denoting J = 5"^^, one concludes that the following statements (i) and (ii) hold 
with probability at least 5: 

(i) dist(a:„, O) < nge < dist(C>, R\ Oi), hence Xn G Oi; 

(ii) as t = 2, 3,..., n + 1, one has yt-iyt < 0, hence 7 * = d'jt-i, therefore 
7n = d"“^7i < d"“^g < w. 

Lemma 6 is proved. □ 


Lemma 7 If $\sum_t \gamma_t = \infty$, $O$ is an open set containing $Z$, and $w > 0$ then for
some $t$, $x_{t-1} \in O$ and $\gamma_t < w$.


Proof. Without loss of generality, suppose that O is bounded and O C 
Choose an open set Oi such that Z C Oi, Oi C O; applying Lemmas 5 and 6 , 
one gets that for 5 = 5{0i,0, w) and for arbitrary initial conditions, 

P(for some t, Xt G O and 'jt < w) > 5. 

Repeating the argument of Lemma 5, one concludes that there exists t such that 
Xt G O and < w- D 

From now on we suppose that $\kappa > k_+(0)$. Choose $\kappa'$ such that $k_+(0) <
\kappa' < \kappa$; using A3(b), one obtains that for some $\varepsilon_0 > 0$, $P(\xi_1 \xi_2 > 0, \text{ or } |\xi_1| <
\varepsilon_0, \text{ or } |\xi_2| < \varepsilon_0) < \kappa'$. Denote $O_0 = \{x : |\varphi(x)| < \varepsilon_0\}$ and $\tau = \inf\{t : x_t \notin O_0\}$.
Without loss of generality, suppose that $O_0$ is bounded.

Lemma 8 Suppose that $\kappa > k_+(0)$; then there exist a constant $b > 0$ and a
monotone decreasing function $p(\cdot)$ such that $\lim_{a \to +\infty} p(a) = 0$ and

    if $\gamma_0 < w$ then $P(\ln\gamma_t < \ln v - bt \text{ for all } t < \tau) \ge 1 - p(v/w)$.

Proof. Define the sequences {pt} and {ct*} by 

pt = \nu- > 0, or < £0, or |^t| < Eq) + 

+ Ind • < 0 & > £o & 161 > £o), 

t 

at = Intc + ^ Pi. 

i=l 

Using and definition of r, one obtains that for all t < t, If crt- The 
variables pt are identically distributed, take the values Inu and Ind, and 

Ept = Inu • P(6-i6 > 0: or |6-i| < £o, or 161 < £o) + 

+ Ind- P(6-i6 < 0 & |6-i| > £o & 161 > £o) < 

< Inu • + Ind • (1 ~ • k + Ind • (1 ~ k) = 0. 

Moreover, the variables in the set {pt, t even}, as well as the variables in the 
set {pt, t odd}, are independent. 

Denote b = —Ept/2. One has 

P(ln7t <hiv — bt for all t < r) > P (cr* < In u — for all t) = 

t 

= P(^(pi + 26) < Inu — lnu> + 6t for all t) > 1 — p{v/w), 

i=l 

where p{a) = pi{a) +p 2 {a), 

Pi (a) = P j X! (P* + 2^) > ^ + ^t for all t 

P 2 {a) = P j X! (Pi + 26) > ^ + ^t for all t 

the sum (Y”) is taken over the even (odd) values of i. Both Y' Y” 
are sums of i.i.d.r.v. with zero mean, hence both pi(a) and P 2 (fl) tend to zero 
as a —> +CX). Lemma 8 is proved. □ 

Define the stopping times Ty = inf{t : xt ^ Oq or In74 > Inu — bt}. Recall 

that / is the primitive of (p such that infa, f{x) = 0. Fix an open set O' 

such that Z C O' C Oq and sup^^g^/ f{x) < infx^Oo fi^)^ denote 6 = 
infx^Oo fi^) - sup^gQ/ f{x). 
Lemma 9 Let $\kappa > k_+(0)$, $x_0 \in O'$, and $\gamma_0 < w$; then

    $P(\tau_v < \infty) < K v^2 + p(v/w);$

here $K$ is a positive constant, and $p(\cdot)$ satisfies the statement of Lemma 8.

Proof. We shall use shorthand notation of Lemma 3: ft '■= f{xt) and tpt ■= 
(p{xt). According to (1^ . one has 

ft - ft-i < + ft) + Y7i-i(V5t-i +6)^ < 

< —'yt-iPt-ift + + ff). 

This implies that /t — /i < QJ + with 


t t 

Q[ = Qt=MY^ jlAph+f!)- 

i—2 i—2 

Using Lemma 8 , one gets 

P(t« < 00 ) < p{v/w) + P' + P", 


where 

P' = P(q;^ > s/2) and P" = P{Q';^ > S/2). 
According to the Chebyshev inequality, 

A A ^ 

p' <pEQ?. = pi: E.,. 

ij = l 


where 


Eij = E I(i - 1 < Tv) ■ 'Yj-ipj-ifj l{j - 1 < Tv)]. 


Using that the values 7 ^, ipi, ft, and I(i < Tv) are iFi-measurable, and using 
assumptions A1 and A2, one obtains that for i j, Eij = 0, and for i = j, 

Eii = E [ 7 f_i</ 5 -_i I(i - 1 < Tv) ■ f/] < sup (p'^{x) ■ S. 

xGOq 


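Therefore,

    $P' \le \dfrac{4 v^2 S}{\delta^2} \sum_{i=2}^{\infty} e^{-2bi} \sup_{x \in O_0} \varphi^2(x).$

Similarly,

    $P'' \le \dfrac{2}{\delta}\, \mathrm{E}\, Q''_{\tau_v} \le \dfrac{2 M v^2}{\delta} \sum_{i=2}^{\infty} e^{-2bi} \Big( \sup_{x \in O_0} \varphi^2(x) + S \Big) = \dfrac{2 M v^2}{\delta} \cdot \dfrac{e^{-4b}}{1 - e^{-2b}} \Big( \sup_{x \in O_0} \varphi^2(x) + S \Big).$

Taking

    $K = \dfrac{e^{-4b}}{1 - e^{-2b}} \Big[ \dfrac{4S}{\delta^2} \sup_{x \in O_0} \varphi^2(x) + \dfrac{2M}{\delta} \Big( \sup_{x \in O_0} \varphi^2(x) + S \Big) \Big],$

one gets that $P' + P'' \le K v^2$. Lemma 9 is proved. □

Lemma 10 If $\kappa > k_+(0)$ then $\sum_t \gamma_t < \infty$.
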


Proof. From the definition of Tv one easily sees that if Ti, = oo for some 
V > 0, then 7* < This implies that for any u > 0 

P (^ 7 t = oo^ < P(r„ = oo). (32) 

Further, by virtue of Lemma 9, if xq € O' and 70 < re then 

< 00 ) < Kw +p{l/y/w). (33) 

Combining 133 and one gets that for any w > 0 

P (^^ 7 t = 00 I Xq G O' and 70 < < Kw + p{l/^/w). (34) 

Define the event = { for some t, Xt G O' and 7 ^ < w}, then by virtue of 

CT . 

P fy^ 7 t = 00 I Aw] < Kw +p{l/y/w). (35) 

Denote by Aw the complementary event, Aw = { for any t, Xt ^ O' or 7 t > ui}. 
By virtue of Lemma 7, 

P 7 t = 00 & Aw'^ = 0. (36) 

Using (I35II and ra . one gets 

P 7 t = 00 ^ = P 7 t = 00 & Aw^ + P 7 t = 00 & An}j < 

< {Kw +p{l/^/w)) ■ P(7lu,). 

Taking into account that w can be chosen arbitrarily small and that Kw + 
p{\/^/w) ^ 0 as ic ^ 0+, one concludes that P (^^ 74 = 00 ) = 0. □ 

Now, we are in a position to prove the theorem. Suppose that $\kappa < \inf_z k_-(z)$;
then $\mathcal{E}^-_\kappa = \emptyset$, and by Lemma 2, $\{x_t\}$ diverges. So, the statement (b) of the theorem
is proved.

On the other hand, according to Lemma 10, if $\kappa > k_+(0)$ then $\sum_t \gamma_t < \infty$,
and by Lemmas 1 and 2, the sequence $\{x_t\}$ converges to a point from $\mathcal{E}^-_\kappa$.
Thus, the statement (a) of the theorem is also established.